ci(e2e): add a curated @smoke gating leg, make the full generic leg informational by jacekradko · Pull Request #8759 · clerk/javascript

jacekradko · 2026-06-05T03:44:39Z

The generic leg runs ~210 tests against one shared staging instance, so any single post-retry flake reds the whole matrix. That is the main reason the staging run has been ~100% red and gives no signal.

This splits the gate. A handful of stable core-auth happy paths are tagged @smoke and run as their own small matrix leg against just the two core apps, and the full generic leg becomes continue-on-error so it still runs and uploads results but no longer fails the run. Because only gating legs (smoke plus the framework legs) can fail the matrix job, needs.integration-tests.result is failure only on a real gating-leg failure, so the existing report job notifies on meaningful failures instead of on generic-only flake. The clerk_go commit-status block stays commented out; when it is wired up (follow-up) it inherits the same gating semantics.
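
The gating rule described above can be sketched as a small model. This is only an illustration of the semantics as stated in this description, not the workflow itself: the `Leg` shape, the `matrixResult` helper, and the `nextjs` leg name are hypothetical, and `continueOnError` stands in for the `continue-on-error` setting on the generic leg.

```typescript
// Hypothetical model of one matrix leg; names are illustrative, not taken from the workflow file.
type LegOutcome = "success" | "failure";

interface Leg {
  name: string;
  outcome: LegOutcome;
  // Stands in for `continue-on-error`: a failing leg with this set does not fail the job.
  continueOnError: boolean;
}

// Aggregate result in the sense of `needs.integration-tests.result`:
// "failure" only when at least one gating (non continue-on-error) leg failed.
function matrixResult(legs: Leg[]): LegOutcome {
  const gatingFailure = legs.some(
    (leg) => leg.outcome === "failure" && !leg.continueOnError,
  );
  return gatingFailure ? "failure" : "success";
}

// Generic-only flake: the informational leg is red, the gating legs are green.
const genericFlake: Leg[] = [
  { name: "smoke", outcome: "success", continueOnError: false },
  { name: "generic", outcome: "failure", continueOnError: true },
  { name: "nextjs", outcome: "success", continueOnError: false }, // hypothetical framework leg
];

// Real regression: a gating leg is red.
const smokeRegression: Leg[] = [
  { name: "smoke", outcome: "failure", continueOnError: false },
  { name: "generic", outcome: "success", continueOnError: true },
];

console.log(matrixResult(genericFlake)); // success
console.log(matrixResult(smokeRegression)); // failure
```

Under this model the report job only sees a failure in the second case, which is the behavior the split is meant to produce.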

The load-bearing choice here is the @smoke set, since it is what gates. I kept it deliberately conservative: five happy paths that never appeared in a failed/flaky list across the recent runs (sign-in with password, with instant password, the modal variant, sign-up with password, and a single-flow sign-out), run against react.vite.withEmailCodes and next.appRouter.withEmailCodes. It is just tags, so it should grow as the rate-limit flake is brought under control. A few tests now run in both legs (smoke and the informational generic), which is cheap insurance and lets the smoke result come from the clean, low-load leg rather than the noisy 210-test one.
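
To make the "it is just tags" point concrete, here is a rough sketch of how a tag-selected smoke leg relates to the generic leg. The test titles, the `tags` field, the `@generic` tag, and the helper names are assumptions for illustration; only the `@smoke` tag and the two app names come from the description above.

```typescript
// Illustrative model only: titles and the `tags` field are assumptions, not the repo's real test files.
interface TestCase {
  title: string;
  tags: string[];
}

// The two core apps named in the description.
const SMOKE_APPS: string[] = ["react.vite.withEmailCodes", "next.appRouter.withEmailCodes"];

const tests: TestCase[] = [
  { title: "sign in with password", tags: ["@generic", "@smoke"] },
  { title: "sign up with password", tags: ["@generic", "@smoke"] },
  { title: "sign in with whatsapp phone code", tags: ["@generic"] },
];

function hasTag(test: TestCase, tag: string): boolean {
  return test.tags.indexOf(tag) !== -1;
}

// Smoke leg: only @smoke tests, each run once per core app.
function planSmokeLeg(all: TestCase[], apps: string[]): string[] {
  const planned: string[] = [];
  for (const app of apps) {
    for (const test of all) {
      if (hasTag(test, "@smoke")) {
        planned.push(`${app} :: ${test.title}`);
      }
    }
  }
  return planned;
}

// Tests that run in both legs (the "cheap insurance" overlap).
function overlap(all: TestCase[]): string[] {
  return all
    .filter((test) => hasTag(test, "@smoke") && hasTag(test, "@generic"))
    .map((test) => test.title);
}

console.log(planSmokeLeg(tests, SMOKE_APPS).length); // 4
console.log(overlap(tests)); // the two password tests
```

Growing the gate then means adding a tag to one more test; nothing about the leg definition itself has to change.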

Stacked on #8757.

changeset-bot · 2026-06-05T03:44:44Z

🦋 Changeset detected

Latest commit: db5acc0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 0 packages

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

vercel · 2026-06-05T03:44:45Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
clerk-js-sandbox	Ready	Preview, Comment	Jun 5, 2026 11:43am

coderabbitai · 2026-06-05T03:44:47Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository YAML (base), Repository UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 38821bb3-2599-4af8-89b9-d3cf2921ac40

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@latest

The staging e2e "generic" leg was red on ~100% of runs because a few independent failures sat behind an all-or-nothing gate:

- whatsapp-phone-code: the WhatsApp channel is not enabled on the staging instance, so the button never renders and the suite times out every run. It also bypasses the isStagingReady graceful-skip, so skip it explicitly on staging until the channel is provisioned.
- custom-pages "survives a parent rerender": validates an unreleased @clerk/react fix (#8604), but the staging leg installs published @latest, so it is deterministically red until release. Skip when E2E_SDK_SOURCE=latest; PR CI (ref builds) still covers it.
- concurrency was keyed on ref (effectively always "main") with cancel-in-progress, so each new staging deploy cancelled the in-flight run and no commit could report a status. Key on the clerk_go commit instead.
- raise the job timeout above the 25-minute test step so the job cap no longer kills runs mid-suite.
- emit and upload a JSON Playwright report in CI so the report job can classify failures (flaky vs failed, infra vs regression) later.
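
The concurrency item is easier to see with a toy simulation. The sketch below assumes every run arrives while the previous run in its concurrency group is still in flight, and that cancel-in-progress means a newer run in a group cancels the older one; the `Run` shape, the commit values, and `survivingRuns` are hypothetical.

```typescript
// Hypothetical shape of a triggered staging run; field names are illustrative.
interface Run {
  id: number;
  ref: string;
  clerkGoCommit: string;
}

// Models cancel-in-progress, assuming every run arrives while the
// previous run in its concurrency group is still in flight.
function survivingRuns(runs: Run[], groupKey: (run: Run) => string): number[] {
  const latestInGroup: Record<string, number> = {};
  for (const run of runs) {
    // A newer run in the same group cancels the one recorded before it.
    latestInGroup[groupKey(run)] = run.id;
  }
  return Object.keys(latestInGroup)
    .map((key) => latestInGroup[key])
    .sort((a, b) => a - b);
}

// Three staging deploys in quick succession, all on the same ref.
const deploys: Run[] = [
  { id: 1, ref: "main", clerkGoCommit: "aaa111" },
  { id: 2, ref: "main", clerkGoCommit: "bbb222" },
  { id: 3, ref: "main", clerkGoCommit: "ccc333" },
];

// Keyed on ref: everything shares one group, so only the newest run survives.
console.log(survivingRuns(deploys, (run) => run.ref)); // [ 3 ]

// Keyed on the clerk_go commit: each commit gets its own group and can report.
console.log(survivingRuns(deploys, (run) => run.clerkGoCommit)); // [ 1, 2, 3 ]
```

If deploys keep arriving faster than a run completes, the ref-keyed variant never lets any run finish, which matches the "no commit could report a status" symptom.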

validate-staging-instances.mjs already diffs prod vs staging /v1/environment but every exit path returned 0, so detected drift blocked nothing and the job was not a dependency of the test matrix. A drifted staging mirror (e.g. a missing phone_number WhatsApp channel) therefore surfaced only as opaque test timeouts 200 tests deep. Add a tight CRITICAL_PATHS allowlist (attribute enabled toggles, phone_number.channels, auth factors/strategies, social enable/disable, password settings) and an ACCEPTED_DRIFT escape hatch so known gaps don't block while new drift does. In strict mode the script exits non-zero on a blocking mismatch; fetch failures and cosmetic drift never fail the build. Wire integration-tests to need validate-instances, and drive strictness from the STAGING_VALIDATE_STRICT repo variable (default report-only). So this is a no-op until the team opts in: it logs blocking drift and the proposed gate without failing anything. Flip the variable to make it enforce.
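
A minimal sketch of the classification described above, under stated assumptions: the environment payloads are pre-flattened to dotted paths, and every path and value here is made up for illustration (the real CRITICAL_PATHS and ACCEPTED_DRIFT contents are not shown in this PR page). It is not the script's actual code, only the shape of the decision: critical and not accepted means blocking, and a blocking mismatch changes the exit code only in strict mode.

```typescript
// Flattened environment: dotted path -> value. A simplification for this sketch.
type FlatEnv = Record<string, string>;

// Hypothetical allowlist and escape hatch; the real lists are not shown in the source.
const CRITICAL_PATHS: string[] = ["phone_number.channels", "password", "social"];
const ACCEPTED_DRIFT: string[] = ["phone_number.channels"];

// Paths whose values differ, or that exist on only one side.
function diffPaths(prod: FlatEnv, staging: FlatEnv): string[] {
  const paths: string[] = [];
  for (const path of Object.keys(prod)) {
    if (prod[path] !== staging[path]) paths.push(path);
  }
  for (const path of Object.keys(staging)) {
    if (!(path in prod)) paths.push(path);
  }
  return paths;
}

// True when `path` equals a prefix or sits below it.
function isUnder(path: string, prefixes: string[]): boolean {
  return prefixes.some((prefix) => path === prefix || path.indexOf(prefix + ".") === 0);
}

// Blocking drift: on a critical path and not explicitly accepted.
function blockingDrift(paths: string[]): string[] {
  return paths.filter((path) => isUnder(path, CRITICAL_PATHS) && !isUnder(path, ACCEPTED_DRIFT));
}

// Report-only by default; non-zero only in strict mode with blocking drift.
function exitCode(blocking: string[], strict: boolean): number {
  return strict && blocking.length > 0 ? 1 : 0;
}

const prod: FlatEnv = {
  "phone_number.channels": "sms,whatsapp",
  "password.min_length": "8",
  "display.theme": "light",
  "social.google.enabled": "true",
};
const staging: FlatEnv = {
  "phone_number.channels": "sms",
  "password.min_length": "8",
  "display.theme": "dark",
  "social.google.enabled": "false",
};

const blocking = blockingDrift(diffPaths(prod, staging));
console.log(blocking); // [ 'social.google.enabled' ]
console.log(exitCode(blocking, false)); // 0
console.log(exitCode(blocking, true)); // 1
```

The accepted WhatsApp gap and the cosmetic theme difference are both logged as drift but never reach the exit code, while the new social toggle mismatch blocks once strict mode is on.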

@smoke

…nformational

The generic leg runs ~210 tests against one shared staging instance, so any single post-retry flake reds the whole matrix and the staging run has been ~100% red, giving no signal. Split the gate: tag a handful of stable core-auth happy paths @smoke and run them as their own small matrix leg against just the two core apps, and mark the full generic leg continue-on-error so it still runs and uploads results but no longer fails the run. Because only gating legs (smoke + the framework legs) can fail the matrix job, needs.integration-tests.result is 'failure' only on a real gating-leg failure, so the report job's Slack notification fires on meaningful failures instead of on generic-only flake. The initial @smoke set is intentionally conservative (sign-in with password / instant password / modal, sign-up with password, sign-out) and is just tags, so it can grow as flake is brought under control.

github-actions Bot added actions integration labels Jun 5, 2026

vercel Bot deployed to Preview June 5, 2026 03:45 View deployment

jacekradko mentioned this pull request Jun 5, 2026

ci(e2e): classify staging e2e results into a structured digest #8760

Open

jacekradko added 2 commits June 5, 2026 06:40

jacekradko force-pushed the jacek/staging-e2e-validate-gate branch from 07c335c to 0eb5396 Compare June 5, 2026 11:42

jacekradko force-pushed the jacek/staging-e2e-smoke-leg branch from 3d2c6ea to db5acc0 Compare June 5, 2026 11:42

vercel Bot deployed to Preview June 5, 2026 11:43 View deployment

jacekradko force-pushed the jacek/staging-e2e-validate-gate branch from bf10aa7 to 9353469 Compare June 11, 2026 17:19

jacekradko wants to merge 3 commits into jacek/staging-e2e-validate-gate from jacek/staging-e2e-smoke-leg

1 participant
